Various methods for search and retrieval of information, such as by a search engine over a network, typically employ text-based searching in which queried words or phrases are compared to an index or other data structure to identify webpages, documents, images, and the like that include matching or semantically similar textual content, metadata, file names, or other textual representations. Such methods of text-based searching work relatively well for text-based documents; however, they are difficult to apply to image files and data.
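By way of illustration only, the text-based searching described above can be sketched as a simple inverted index. The sketch below is a minimal example under stated assumptions, not a description of any particular search engine: the function names `build_index` and `search`, the whitespace tokenization, and the sample file names are all hypothetical.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of item ids whose text contains it."""
    index = defaultdict(set)
    for item_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(item_id)
    return index


def search(index, query):
    """Return the ids of items whose text contains every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Copy the first posting set so intersecting does not mutate the index.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical collection: each item is represented only by its associated text.
documents = {
    "page1.html": "red sports car review",
    "beach.jpg": "sunset beach",  # an image with a human-supplied title
    "IMG_0042.jpg": "",           # an image with no associated text
}

index = build_index(documents)
print(search(index, "sports car"))
print(search(index, "beach"))
```

In this sketch the image with no associated text never appears in any result set, regardless of what the image depicts, which is the limitation noted above for image files.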
Search engines and algorithms employed for text-based searching cannot search image files based on the content of the image and thus are limited to identifying search result images based only on the data associated with the images. Methods for content-based searching of images have been developed that analyze the content of the images to identify image features and/or visually similar images. These methods, however, are inefficient and do not scale well to large-scale searches in which, for example, millions or even billions of images must be quickly searched to identify and provide search result images to a user. Thus, in order to search large databases of image files, a text-based query is typically required, necessitating that each image file be associated with one or more textual elements such as a title, file name, or other metadata or tags.
Typical image files, in addition to image content data, include limited textual information in the form of metadata. This conventional metadata includes attribute tags or information describing the image according to various standards, for example the Exchangeable Image File Format (EXIF) standard. EXIF is widely used by digital cameras to embed technical metadata into the image files they create. A primary feature of EXIF is its ability to record camera information in an image file at the point of capture. Some common data fields include the camera make and model, its serial number, the date and time of image capture, GPS coordinates of where image capture occurred, the shutter speed, the aperture, the lens used, and the ISO speed setting. EXIF metadata can include other technical details, such as white balance and distance to the subject. Other textual information associated with an image file is typically required to be input by a human.
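As a hedged illustration of the kind of metadata described above, the following sketch labels a handful of numeric EXIF tags with their field names. The tag numbers are those defined by the EXIF standard for the listed fields; the helper name `label_exif` and all sample values are hypothetical, and the sketch assumes the tags have already been extracted from the file by some reader. GPS coordinates are omitted because EXIF stores them in a separate GPS sub-directory.

```python
# Numeric EXIF tag ids for some of the fields mentioned above.
EXIF_TAG_NAMES = {
    0x010F: "Make",
    0x0110: "Model",
    0x829A: "ExposureTime",      # shutter speed, in seconds
    0x829D: "FNumber",           # aperture
    0x8827: "ISOSpeedRatings",
    0x9003: "DateTimeOriginal",  # date and time of capture
    0x9206: "SubjectDistance",   # in meters
    0xA403: "WhiteBalance",      # 0 = auto, 1 = manual
    0xA431: "BodySerialNumber",
    0xA434: "LensModel",
}


def label_exif(raw_tags):
    """Replace numeric tag ids with readable names; unknown ids are kept in hex."""
    return {
        EXIF_TAG_NAMES.get(tag_id, f"Unknown(0x{tag_id:04X})"): value
        for tag_id, value in raw_tags.items()
    }


# Hypothetical tag values, as if already read from an image file.
raw_tags = {
    0x010F: "ExampleCam",
    0x0110: "EC-100",
    0x9003: "2020:01:15 09:30:00",
    0x829A: 1 / 250,
    0x829D: 2.8,
    0x8827: 400,
    0xA403: 0,
}

labeled = label_exif(raw_tags)
for name, value in labeled.items():
    print(f"{name}: {value}")
```

Every field in this sketch records how and when the image was captured; none records what the image depicts, so a text-based query about the image content would find nothing to match.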